Extending MARIE: an N-gram-based SMT decoder
Abstract
In this paper we present several extensions of MARIE, a freely available N-gram-based statistical machine translation (SMT) decoder. The extensions mainly consist of the ability to accept and generate word graphs and the introduction of two new N-gram models in the log-linear combination of feature functions that the decoder implements. Additionally, the decoder is enhanced with a caching strategy that reduces the number of N-gram calls, improving overall search efficiency. Experiments are carried out on the European Parliament Spanish-English translation task.
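As a rough illustration of the two mechanisms named in the abstract, the sketch below memoizes N-gram lookups and combines feature values log-linearly. It is not MARIE's implementation: the abstract does not describe how the decoder's cache works, so plain dictionary memoization is assumed here, and `CachedNgramModel`, `toy_score`, `loglinear_score`, and the feature names are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Context = Tuple[str, ...]


class CachedNgramModel:
    """Wraps a (hypothetical) N-gram scoring function with a lookup cache."""

    def __init__(self, score_fn: Callable[[Context, str], float], order: int):
        self.score_fn = score_fn
        self.order = order
        self.cache: Dict[Tuple[Context, str], float] = {}
        self.calls = 0  # lookups that actually reach the underlying model

    def logprob(self, history: Sequence[str], word: str) -> float:
        # Only the last (order - 1) words of the history matter to the model.
        n = self.order - 1
        context: Context = tuple(history[-n:]) if n > 0 else ()
        key = (context, word)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.score_fn(context, word)
        return self.cache[key]


def loglinear_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values (log-probabilities), one weight per feature."""
    return sum(weights[name] * value for name, value in features.items())


def toy_score(context: Context, word: str) -> float:
    """Stand-in for a real language model query (assumption for the example)."""
    return -1.0 * (len(context) + 1)


model = CachedNgramModel(toy_score, order=3)
first = model.logprob(["the", "european"], "parliament")
# Same trigram context, so this lookup is served from the cache.
second = model.logprob(["in", "the", "european"], "parliament")
total = loglinear_score({"tm": -2.0, "lm": first}, {"tm": 1.0, "lm": 0.5})
```

During beam search many hypotheses share the same recent words, so the same N-gram is requested repeatedly; in this sketch the repeated request costs a dictionary lookup instead of a model query.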
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
N-gram-based machine translation enhanced with neural networks for the French-English BTEC-IWSLT'10 task
Neural Network Language Models (NNLMs) have been applied to Statistical Machine Translation (SMT), improving translation quality. N-best list rescoring is the most popular approach to deal with the computational problems that appear when using huge NNLMs. But the question of “how much improvement could be achieved in a coupled system” remains unanswered. This open question motivated som...
Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
The phrase-based and N-gram-based SMT frameworks complement each other. While the former is better able to memorize, the latter provides a more principled model that captures dependencies across phrasal boundaries. Some work has been done to combine insights from these two frameworks. A recent successful attempt showed the advantage of using phrase-based search on top of an N-gram-based model. W...
Reordered Search and Tuple Unfolding for Ngram-based SMT
In Statistical Machine Translation, the use of reordering for certain language pairs can produce a significant improvement in translation accuracy. However, the search problem is shown to be NP-hard when arbitrary reorderings are allowed. This paper addresses the question of reordering for an Ngram-based SMT approach following two complementary strategies, namely reordered search and tuple unfo...
Algoritmo de Decodificación de Traducción Automática Estocástica basado en N-gramas
In this paper we describe MARIE, an N-gram-based stochastic machine translation decoder. It is implemented using a beam search strategy, with distortion (or reordering) capabilities. The underlying translation model is based on an N-gram approach, extended to introduce reordering at the phrase level. The search graph structure is designed to perform very accurate comparisons, which allows for a...